import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import sympy as sy
sy.init_printing()"ggplot") plt.style.use(
=3)
np.set_printoptions(precision=True) np.set_printoptions(suppress
An eigenvector of an \(n \times n\) matrix \(A\) is a nonzero vector \(x\) such that \(Ax = \lambda x\) for some scalar \(\lambda\). A scalar \(\lambda\) is called an eigenvalue of \(A\) if there is a nontrivial solution \(x\) of \(Ax = \lambda x\), such an \(x\) is called an eigenvector corresponding to \(\lambda\).
Rewrite the equation,
\[ (A-\lambda I)x = 0 \]
Since the eigenvector should be a nonzero vector, which means:
- The column or rows of \((A-\lambda I)\) are linearly dependent
- \((A-\lambda I)\) is not full rank, \(Rank(A)<n\).
- \((A-\lambda I)\) is not invertible.
- \(\text{det}(A-\lambda I)=0\), which is called characteristic equation.
Consider a matrix \(A\)
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 2 & -2 & 3 \end{bmatrix} \]
Set up the characteristic equation,
\[ \text{det}\left( \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 2 & -2 & 3 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) = 0 \]
Use SymPy charpoly
and factor
, we can have straightforward solutions for eigenvalues.
= sy.symbols(
lamda "lamda"
# 'lamda' withtout 'b' is reserved for SymPy, lambda is reserved for Python )
charpoly
returns characteristic equation.
= sy.Matrix([[1, 0, 0], [1, 0, 1], [2, -2, 3]])
A = A.charpoly(lamda)
p p
\(\displaystyle \operatorname{PurePoly}{\left( \lambda^{3} - 4 \lambda^{2} + 5 \lambda - 2, \lambda, domain=\mathbb{Z} \right)}\)
Factor the polynomial such that we can see the solution.
sy.factor(p)
\(\displaystyle \operatorname{PurePoly}{\left( \lambda^{3} - 4 \lambda^{2} + 5 \lambda - 2, \lambda, domain=\mathbb{Z} \right)}\)
From the factored characteristic polynomial, we get the eigenvalue, and \(\lambda =1\) has algebraic multiplicity of \(2\), because there are two \((\lambda-1)\). If not factored, we can use solve
instead.
sy.solve(p, lamda)
\(\displaystyle \left[ 1, \ 2\right]\)
Or use eigenvals
directly.
A.eigenvals()
\(\displaystyle \left\{ 1 : 2, \ 2 : 1\right\}\)
To find the eigenvector corresponding to \(\lambda\), we substitute the eigenvalues back into \((A-\lambda I)x=0\) and solve it. Construct augmented matrix with \(\lambda =1\) and perform rref.
- 1 * sy.eye(3)).row_join(sy.zeros(3, 1)).rref() (A
\(\displaystyle \left( \left[\begin{matrix}1 & -1 & 1 & 0\\0 & 0 & 0 & 0\\0 & 0 & 0 & 0\end{matrix}\right], \ \left( 0,\right)\right)\)
The null space is the solution set of the linear system.
\[ \left[ \begin{matrix} x_1 \\ x_2 \\ x_3 \end{matrix} \right]= \left[ \begin{matrix} x_2-x_3 \\ x_2 \\ x_3 \end{matrix} \right]= x_2\left[ \begin{matrix} 1 \\ 1 \\ 0 \end{matrix} \right] +x_3\left[ \begin{matrix} -1 \\ 0 \\ 1 \end{matrix} \right] \]
This is called eigenspace for \(\lambda = 1\), which is a subspace in \(\mathbb{R}^3\). All eigenvectors are inside the eigenspace.
We can proceed with \(\lambda = 2\) as well.
- 2 * sy.eye(3)).row_join(sy.zeros(3, 1)).rref() (A
\(\displaystyle \left( \left[\begin{matrix}1 & 0 & 0 & 0\\0 & 1 & - \frac{1}{2} & 0\\0 & 0 & 0 & 0\end{matrix}\right], \ \left( 0, \ 1\right)\right)\)
The null space is the solution set of the linear system.
\[ \left[ \begin{matrix} x_1 \\ x_2 \\ x_3 \end{matrix} \right]= \left[ \begin{matrix} 0\\ \frac{1}{2}x_3\\ x_3 \end{matrix} \right]= x_3\left[ \begin{matrix} 0 \\ \frac{1}{2} \\ 1 \end{matrix} \right] \]
To avoid troubles of solving back and forth, SymPy has eigenvects
to calcuate eigenvalues and eigenspaces (basis of eigenspace).
= A.eigenvects()
eig eig
\(\displaystyle \left[ \left( 1, \ 2, \ \left[ \left[\begin{matrix}1\\1\\0\end{matrix}\right], \ \left[\begin{matrix}-1\\0\\1\end{matrix}\right]\right]\right), \ \left( 2, \ 1, \ \left[ \left[\begin{matrix}0\\\frac{1}{2}\\1\end{matrix}\right]\right]\right)\right]\)
To clarify what we just get, write
print(
"Eigenvalue = {0}, Multiplicity = {1}, Eigenspace = {2}".format(
0][0], eig[0][1], eig[0][2]
eig[
) )
Eigenvalue = 1, Multiplicity = 2, Eigenspace = [Matrix([
[1],
[1],
[0]]), Matrix([
[-1],
[ 0],
[ 1]])]
print(
"Eigenvalue = {0}, Multiplicity = {1}, Eigenspace = {2}".format(
1][0], eig[1][1], eig[1][2]
eig[
) )
Eigenvalue = 2, Multiplicity = 1, Eigenspace = [Matrix([
[ 0],
[1/2],
[ 1]])]
NumPy Functions for Eigenvalues and Eigenspace
Convert SymPy matrix into NumPy float array.
= np.array(A).astype(float)
A A
array([[ 1., 0., 0.],
[ 1., 0., 1.],
[ 2., -2., 3.]])
.eigvals()
and .eig(A)
are handy functions for eigenvalues and eigenvectors.
np.linalg.eigvals(A)
array([2., 1., 1.])
# return both eigenvalues and eigenvectors np.linalg.eig(A)
EigResult(eigenvalues=array([2., 1., 1.]), eigenvectors=array([[ 0. , 0. , 0.408],
[ 0.447, 0.707, -0.408],
[ 0.894, 0.707, -0.816]]))
An Example
Consider a matrix \(A\)
= sy.Matrix(
A
[-4, -4, 20, -8, -1],
[14, 12, 46, 18, 2],
[6, 4, -18, 8, 1],
[11, 7, -37, 17, 2],
[18, 12, -60, 24, 5],
[
]
) A
\(\displaystyle \left[\begin{matrix}-4 & -4 & 20 & -8 & -1\\14 & 12 & 46 & 18 & 2\\6 & 4 & -18 & 8 & 1\\11 & 7 & -37 & 17 & 2\\18 & 12 & -60 & 24 & 5\end{matrix}\right]\)
Find eigenvalues.
= A.eigenvals()
eig eig
\(\displaystyle \left\{ 2 : 2, \ 3 : 1, \ \frac{5}{2} - \frac{\sqrt{1473}}{2} : 1, \ \frac{5}{2} + \frac{\sqrt{1473}}{2} : 1\right\}\)
Or use NumPy functions, show the eigenvalues.
= np.array(A)
A = A.astype(float)
A = np.linalg.eig(A)
eigval, eigvec eigval
array([ 21.69, -16.69, 3. , 2. , 2. ])
And corresponding eigenvectors.
eigvec
array([[-0.124, -0.224, 0. , -0.039, 0.611],
[ 0.886, -0.543, -0.894, -0.216, -0.149],
[ 0.124, 0.224, 0. , 0. , -0. ],
[ 0.216, 0.392, 0.447, 0.255, -0.462],
[ 0.371, 0.672, -0. , -0.942, 0.625]])
A Visualization Example
Let
\[ A= \left[ \begin{matrix} 1 & 6\\ 5 & 2 \end{matrix} \right] \]
find the eigenvalues and vectors, then visualize in \(\mathbb{R}^2\)
Use characteristic equation \(|A - \lambda I|=0\)
\[ \left| \left[ \begin{matrix} 1 & 6\\ 5 & 2 \end{matrix} \right] - \left[ \begin{matrix} \lambda & 0\\ 0 & \lambda \end{matrix} \right]\right|=0 \]
= sy.symbols("lamda")
lamda = sy.Matrix([[1, 6], [5, 2]])
A = sy.eye(2) I
- lamda * I A
\(\displaystyle \left[\begin{matrix}1 - \lambda & 6\\5 & 2 - \lambda\end{matrix}\right]\)
= A.charpoly(lamda)
p p
\(\displaystyle \operatorname{PurePoly}{\left( \lambda^{2} - 3 \lambda - 28, \lambda, domain=\mathbb{Z} \right)}\)
sy.factor(p)
\(\displaystyle \operatorname{PurePoly}{\left( \lambda^{2} - 3 \lambda - 28, \lambda, domain=\mathbb{Z} \right)}\)
There are two eigenvalues: \(7\) and \(4\). Next we calculate eigenvectors.
- 7 * sy.eye(2)).row_join(sy.zeros(2, 1)).rref() (A
\(\displaystyle \left( \left[\begin{matrix}1 & -1 & 0\\0 & 0 & 0\end{matrix}\right], \ \left( 0,\right)\right)\)
The eigenspace for \(\lambda = 7\) is
\[ \left[ \begin{matrix} x_1\\ x_2 \end{matrix} \right]= x_2\left[ \begin{matrix} 1\\ 1 \end{matrix} \right] \]
Any vector is eigenspace as long as \(x \neq 0\) is an eigenvector. Let’s find out eigenspace for \(\lambda = 4\).
+ 4 * sy.eye(2)).row_join(sy.zeros(2, 1)).rref() (A
\(\displaystyle \left( \left[\begin{matrix}1 & \frac{6}{5} & 0\\0 & 0 & 0\end{matrix}\right], \ \left( 0,\right)\right)\)
The eigenspace for \(\lambda = -4\) is
\[ \left[ \begin{matrix} x_1\\ x_2 \end{matrix} \right]= x_2\left[ \begin{matrix} -\frac{6}{5}\\ 1 \end{matrix} \right] \]
Let’s plot both eigenvectors as \((1, 1)\) and \((-6/5, 1)\) and multiples with eigenvalues.
= plt.subplots(figsize=(12, 12))
fig, ax = np.array(
arrows 0, 0, 1, 1]], [[0, 0, -6 / 5, 1]], [[0, 0, 7, 7]], [[0, 0, 24 / 5, -4]]]
[[[
)= ["darkorange", "skyblue", "r", "b"]
colors
for i in range(arrows.shape[0]):
= zip(*arrows[i, :, :])
X, Y, U, V
ax.arrow(0],
X[0],
Y[0],
U[0],
V[=colors[i],
color=0.08,
width=True,
length_includes_head=0.3,
head_width=0.6,
head_length=0.4,
overhang=-i,
zorder
)
# Eigenspace for λ = 7
= np.arange(-10, 10.6, 0.5)
x = x
y =2, color="red", alpha=0.3, label=r"Eigenspace for $\lambda = 7$")
ax.plot(x, y, lw
# Eigenspace for λ = -4
= np.arange(-10, 10.6, 0.5)
x = -5 / 6 * x
y =2, color="blue", alpha=0.3, label=r"Eigenspace for $\lambda = -4$")
ax.plot(x, y, lw
# Annotation Arrows
= "Simple, tail_width=0.5, head_width=5, head_length=10"
style = dict(arrowstyle=style, color="k")
kw
= mpl.patches.FancyArrowPatch(
a 1, 1), (7, 7), connectionstyle="arc3,rad=.4", **kw, alpha=0.3
(
)
plt.gca().add_patch(a)
= mpl.patches.FancyArrowPatch(
a -6 / 5, 1), (24 / 5, -4), connectionstyle="arc3,rad=.4", **kw, alpha=0.3
(
)
plt.gca().add_patch(a)
# Legend
= ax.legend(fontsize=15, loc="lower right")
leg 0.5)
leg.get_frame().set_alpha(
# Axis, Spines, Ticks
-10, 10.1, -10.1, 10.1])
ax.axis(["left"].set_position("center")
ax.spines["right"].set_color("none")
ax.spines["bottom"].set_position("center")
ax.spines["top"].set_color("none")
ax.spines["bottom")
ax.xaxis.set_ticks_position("left")
ax.yaxis.set_ticks_position(
ax.minorticks_on()="both", direction="inout", length=12, width=2, which="major")
ax.tick_params(axis="both", direction="inout", length=10, width=1, which="minor")
ax.tick_params(axis plt.show()
Geometric Intuition
Eigenvector has a special property that preserves the pointing direction after linear transformation.To illustrate the idea, let’s plot a ‘circle’ and arrows touching edges of circle.
Start from one arrow. If you want to draw a smoother circle, you can use parametric function rather two quadratic functions, because cicle can’t be draw with one-to-one mapping.But this is not the main issue, we will live with that.
= np.linspace(-4, 4)
x = np.sqrt(16 - x**2)
y_u = -np.sqrt(16 - x**2)
y_d
= plt.subplots(figsize=(8, 8))
fig, ax ="b")
ax.plot(x, y_u, color="b")
ax.plot(x, y_d, color
0, 0, s=100, fc="k", ec="r")
ax.scatter(
ax.arrow(0,
0,
5],
x[5],
y_u[=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="r",
ec="None",
fc
) plt.show()
Now, the same ‘circle’, but more arrows.
= np.linspace(-4, 4, 50)
x = np.sqrt(16 - x**2)
y_u = -np.sqrt(16 - x**2)
y_d
= plt.subplots(figsize=(8, 8))
fig, ax ="b")
ax.plot(x, y_u, color="b")
ax.plot(x, y_d, color
0, 0, s=100, fc="k", ec="r")
ax.scatter(
for i in range(len(x)):
ax.arrow(0,
0,
x[i],
y_u[i],=0.08,
head_width=0.27,
head_length=True,
length_includes_head=0.01,
width="r",
ec="None",
fc
)
ax.arrow(0,
0,
x[i],
y_d[i],=0.08,
head_width=0.27,
head_length=True,
length_includes_head=0.008,
width="r",
ec="None",
fc )
Now we will perform linear transformation on the circle. Technically, we can only transform the points - the arrow tip - that we specify on the circle.
We create a matrix
\[ A = \left[\begin{matrix} 3 & -2\\ 1 & 0 \end{matrix}\right] \]
Align all the coordinates into two matrices for upper and lower half respectively.
\[ V_u = \left[\begin{matrix} x_1^u & x_2^u & \ldots & x_m^u\\ y_1^u & y_2^u & \ldots & y_m^u \end{matrix}\right]\\ V_d = \left[\begin{matrix} x_1^d & x_2^d & \ldots & x_m^d\\ y_1^d & y_2^d & \ldots & y_m^d \end{matrix}\right] \]
The matrix multiplication \(AV_u\) and \(AV_d\) are linear transformation of the circle.
= np.array([[3, -2], [1, 0]])
A
= np.hstack((x[:, np.newaxis], y_u[:, np.newaxis]))
Vu = A @ Vu.T
AV_u
= np.hstack((x[:, np.newaxis], y_d[:, np.newaxis]))
Vd = A @ Vd.T AV_d
The circle becomes an ellipse. However, if you watch closely, you will find there are some arrows still pointing the same direction after linear transformation.
= plt.subplots(figsize=(8, 8))
fig, ax
for i in range(len(x)):
ax.arrow(0,
0,
0, i],
AV_u[1, i],
AV_u[=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="darkorange",
ec="None",
fc
)
ax.arrow(0,
0,
0, i],
AV_d[1, i],
AV_d[=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="darkorange",
ec="None",
fc
)-15, 15, -5, 5])
ax.axis([ plt.show()
We can plot the cirle and ellipse together, those vectors pointing the same direction before and after the linear transformation are eigenvector of \(A\), eigenvalue is the length ratio between them.
= 50
k = np.linspace(-4, 4, k)
x = np.sqrt(16 - x**2)
y_u = -np.sqrt(16 - x**2)
y_d
= plt.subplots(figsize=(16, 8))
fig, ax
0, 0, s=100, fc="k", ec="r")
ax.scatter(
for i in range(len(x)):
ax.arrow(0,
0,
x[i],
y_u[i],=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="r",
ec=0.5,
alpha="None",
fc
)
ax.arrow(0,
0,
x[i],
y_d[i],=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="r",
ec=0.5,
alpha="None",
fc
)
= np.array([[3, -2], [1, 0]])
A
= np.hstack((x[:, np.newaxis], y_u[:, np.newaxis]))
v = A @ v.T
Av_1
= np.hstack((x[:, np.newaxis], y_d[:, np.newaxis]))
v = A @ v.T
Av_2
for i in range(len(x)):
ax.arrow(0,
0,
0, i],
Av_1[1, i],
Av_1[=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="darkorange",
ec="None",
fc
)
ax.arrow(0,
0,
0, i],
Av_2[1, i],
Av_2[=0.18,
head_width=0.27,
head_length=True,
length_includes_head=0.03,
width="darkorange",
ec="None",
fc
)= 7
n -2 * n, 2 * n, -n, n])
ax.axis([ plt.show()